Text Classification Using WordNet Hypernyms

نویسندگان

  • Sam Scott
  • Stan Matwin
چکیده

This paper describes experiments in Machine Learning for text classification using a new representation of text based on WordNet hypernyms. Six binary classification tasks of varying difficulty are defined, and the Ripper system is used to produce discrimination rules for each task using the new hypernym density representation. Rules are also produced with the commonly used bag-of-words representation, incorporating no knowledge from WordNet. Experiments show that for some of the more difficult tasks the hypernym density representation leads to significantly more accurate and more comprehensible rules.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Evaluating WordNet Features in Text Classification Models

Incorporating semantic features from the WordNet lexical database is among one of the many approaches that have been tried to improve the predictive performance of text classification models. The intuition behind this is that keywords in the training set alone may not be extensive enough to enable generation of a universal model for a category, but if we incorporate the word relationships in Wo...

متن کامل

COMPARISON OF THE EFFECTS OF LEXICAL AND ONTOLOGICAL INFORMATION ON TEXT CATEGORIZATION by CESAR KOIRALA

ON TEXT CATEGORIZATION by CESAR KOIRALA (Under the Direction of Khaled Rasheed) ABSTRACT This thesis compares the effectiveness of using lexical and ontological information for text categorization. Lexical information has been induced using stemmed features. Ontological information, on the other hand, has been induced in the form of WordNet hypernyms. Text representations based on stemming and ...

متن کامل

Lexical Inference Mechanisms for Text Understanding and Classification

This paper describes a framework for building story traces (compact global views of a narrative) and story projections (selections of key elements of a narrative) and their applications in text understanding and classification. Word and sense properties are extracted using the WordNet lexical database enhanced with Prolog inference rules and a number of lexical transformations. Inference rules ...

متن کامل

Query Refinement and User Relevance Feedback for Contextualized Image Retrieval

The motivation of this paper is to increase the user perceived precision of results of Content Based Information Retrieval (CBIR) systems with Query Refinement (QR), Visual Analysis (VA) and Relevance Feedback (RF) algorithms. The proposed algorithms were implemented as modules into K-Space CBIR system. The QR module discovers hypernyms for the given query from a free text corpus (Wikipedia) an...

متن کامل

Using WordNet Hypernyms and Dependency Features for Phrasal-Level Event Recognition and Type Classification

The goal of this research is to devise a method for recognizing and classifying TimeML events in a more effective way. TimeML is the most recent annotation scheme for processing the event and temporal expressions in natural language processing fields. In this paper, we argue and demonstrate that unit feature dependency information and deep-level WordNet hypernyms are useful for event recognitio...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 1998